Abstract


Motivated by our earlier work on Turing computability via neural networks [4, 3] and by the results of Maass et al. [14, 15] on the limits of what can actually be computed by neural networks when noise (or limited precision on the weights) is considered, we introduce the notion of Definite Turing Machine (DTM) and investigate some of its properties. We associate to every Turing Machine (TM) a Finite State Automaton (FSA) responsible for the next state transition and action (output and head movement). A DTM is a TM whose FSA is definite. An FSA is definite (of order k > 0) if its present state can be uniquely determined by the last k inputs. We show that DTMs are strictly less powerful than TMs but are capable of computing all simple functions [1]. The corresponding notion of finite-memory Turing machine is shown to be computationally equivalent to the Turing machine.


1. Introduction 


Faced with the problem of determining the computing power of weighted neural networks and relating them to conventional models of computing, researchers have come up with two main approaches:


• an infinite number of neurons [7, 10, 21], or


• a finite number of neurons but with unbounded (rational) weights [19, 17].


They all have in common the assumption that the TM tape, an infinite resource, should be represented "inside" the NN. This accounts for the need for an infinite number of neurons or for the unbounded weights mentioned above. While the first approach is not biologically plausible, the second is, due to the limits on measurements, not physically plausible. When some form of imprecision or noise due to physical, biological or technological limitations is considered, the computing power of analog NNs is considerably decreased [14, 15].

We have on previous occasions [4, 3] argued against 
these approaches by claiming that the infinite tape of a 
Turing machine is an external non-intrinsic feature, which 
should not be internalized. In Turing's own analysis [22],
the tape size is merely a mathematical convenience. In fact, 
the tape is meant as an auxiliary memory for calculations by 
the machine as a counterpart to a sheet of paper for humans. 
The idea is that 


Turing machine = finite state control + infinite tape 


is a mathematical abstraction of 


Human calculator = mind + sheets of paper 


We then proceed to simulate a TM with a Discrete Time Recurrent Neural Network (DTRNN), leaving the TM tape "outside" the network and regarding it as the environment with which the network interacts. We can take two views. In one view, our neural model is composed of four main modules: recognition, control, writing and movement. The first module recognizes the content of a tape cell. The result of this recognition process is then sent to the control module, which simulates the TM control ("mind"). The output of the control module is passed on to: (1) the writing module, which writes the symbol on the tape; and (2) the movement module, which moves the TM's head left or right. In another, more abstract view we regard our system as a sort of Neural Turing Machine, i.e. a Turing Machine in which the state control is governed by a neural network.
No matter the view, the simulation amounts to actually implementing the Finite State Automaton (FSA) (more precisely, a sequential Mealy machine [12]) responsible for the next state and action (output and movement) of the TM. Given the results in [15], which indicate that a NN can, in the presence of noise, implement at most definite automata, the natural question to ask is "What is the computing power of a TM with definite control?". So we define and investigate the properties of these Definite Turing Machines and the related notion of Finite-Memory Turing Machines.

For a more detailed critique of current implementations
of TMs in Neural Networks, we refer the reader to [3].


2. Main Definitions 


Below we start by formalizing the intuitive notion of a 
Turing machine depicted above. Then we define the no- 
tion of Mealy machine, which is a finite state automaton 
with an output function. And finally we define the kind of 
neural networks we use, the first order recurrent neural net- 
work where the units have the logistic function as activation 
function[2]. Next we show how our simulation works and 
then we present a notion of Definite Turing Machines. 


Definition 1 A Turing Machine $T$ over a finite alphabet $\Sigma$ is a quintuple $T = (\Sigma_T, Q_T, \delta_T, s_I, D_T)$ (the subscripts are used when necessary) where: (1) $Q = \{q_1, \ldots, q_{|Q|}\}$ is a finite set of states with a distinguished (initial) state $q_I$ (usually $q_1$); (2) $\delta : Q \times \Sigma \to Q \times \Sigma \times D$ is a function, where $D = \{-1, 0, 1\}$ is the set of possible head movements. $\delta$ is to be thought of as a finite set of instructions, and $\delta(q_i, \sigma) = (q_j, \sigma', m)$ means that the machine, being in the state $q_i \in Q$ and reading the symbol $\sigma \in \Sigma$ from the current cell of the tape, takes the following actions: it erases $\sigma$ from the cell and writes $\sigma'$ in its place; it changes the internal state from $q_i$ to $q_j$; and it moves the head one cell to the left ($m = -1$), one cell to the right ($m = 1$), or does not move it ($m = 0$).


Definition 2 A Mealy machine is a sixtuple $M = (\Sigma, \Gamma, Q, \delta, \lambda, q_I)$ where: (1) $Q = \{q_1, q_2, \ldots, q_{|Q|}\}$ is a finite set of states; (2) $\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_{|\Sigma|}\}$ is a finite input alphabet; (3) $\Gamma = \{\gamma_1, \gamma_2, \ldots, \gamma_{|\Gamma|}\}$ is a finite output alphabet; (4) $\delta : Q \times \Sigma \to Q$ is the next state function; $\delta(q_j, \sigma_k) = q_i$ should be interpreted as "the machine, being in state $q_j$ and reading symbol $\sigma_k$, goes to state $q_i$"; (5) $\lambda : Q \times \Sigma \to \Gamma$ is the output function; $\lambda(q_j, \sigma_k) = \gamma_i$ should be interpreted as "the machine, being in state $q_j$ and reading symbol $\sigma_k$, writes symbol $\gamma_i$"; and (6) $q_I \in Q$ is the initial state, in which the machine is before the first symbol is read.


Remark 1 A pair $(Q, \delta)$ is usually called a transition system (over $\Sigma$). We can sometimes refer to the transition system of the Mealy machine.


Throughout let $g : \mathbb{R} \to [0, 1]$ be the logistic map:


$$g(x) = \frac{1}{1 + \exp(-x)}$$


In what follows the superscripts in the weights indicate the computation involved: for example, the $xu$ in $W^{xu}$ indicates that the weight is used to compute a state ($x$) from an input ($u$); the $x$ in $W^{x}$ (a bias) indicates that it is used to compute a state. So $u$, $x$, $y$ and $z$ in the superscript mean, respectively, input, state, output and hidden output layer.


Definition 3 A discrete-time recurrent neural network, or simply a neural network, is a sixtuple $N = (X, U, Y, f, h, x_0)$ where: (1) $X = [0, 1]^{n_X}$, the $n_X$-dimensional unit cube, is the state space of the network and $n_X$ is the number of units or neurons; (2) $U = \mathbb{R}^{n_U}$ is the set of possible input vectors, with $\mathbb{R}$ the set of real numbers and $n_U$ the number of input lines; (3) $Y = [0, 1]^{n_Y}$ is the set of outputs of the network, with $n_Y$ the number of output units; (4) $f : X \times U \to X$ is the next state function, which computes a new state $x[t]$ from the previous state $x[t-1]$ and the input just read $u[t]$. The $i$-th coordinate of $f$ is given by


$$f_i(x[t-1], u[t]) = g\left( \sum_{j=1}^{n_X} W^{xx}_{ij}\, x_j[t-1] + \sum_{j=1}^{n_U} W^{xu}_{ij}\, u_j[t] + W^{x}_{i} \right)$$


(5) $h : X \times U \to Y$ is the output function, which computes a new output $y[t]$ from the previous state $x[t-1]$ and the input just read $u[t]$. The $i$-th coordinate of $h$ is given by (see footnote 1)


$$h_i(x[t-1], u[t]) = g\left( \sum_{j=1}^{n_Z} W^{yz}_{ij}\, z_j[t] + W^{y}_{i} \right)$$


where


$$z_i[t] = g\left( \sum_{j=1}^{n_X} W^{zx}_{ij}\, x_j[t-1] + \sum_{j=1}^{n_U} W^{zu}_{ij}\, u_j[t] + W^{z}_{i} \right)$$


(6) and finally, $x_0$ is the initial state of the network, which is simply the value used for $x[0]$.
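As an illustration of Definition 3, the following is a minimal sketch of one time step of such a network in plain Python. The dictionary keys (`Wxx`, `Wxu`, ...) and the helper names `layer`, `step` and `zero_net` are our own hypothetical choices, and the hidden output layer $z$ is the one introduced in footnote 1.

```python
import math

def g(x):
    """Logistic map g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(w_a, a, w_b, b, bias):
    """Return g(W_a a + W_b b + bias), coordinate by coordinate."""
    return [
        g(sum(w * v for w, v in zip(row_a, a))
          + sum(w * v for w, v in zip(row_b, b))
          + c)
        for row_a, row_b, c in zip(w_a, w_b, bias)
    ]

def step(net, x_prev, u):
    """Compute (x[t], y[t]) from x[t-1] and u[t]."""
    x = layer(net["Wxx"], x_prev, net["Wxu"], u, net["Wx"])
    z = layer(net["Wzx"], x_prev, net["Wzu"], u, net["Wz"])
    # The output layer only sees z, so its second weight block is empty.
    y = layer(net["Wyz"], z, [[] for _ in net["Wy"]], [], net["Wy"])
    return x, y

def zero_net(n_x, n_u, n_z, n_y):
    """Hypothetical helper: a network with all weights and biases zero."""
    zeros = lambda rows, cols: [[0.0] * cols for _ in range(rows)]
    return {
        "Wxx": zeros(n_x, n_x), "Wxu": zeros(n_x, n_u), "Wx": [0.0] * n_x,
        "Wzx": zeros(n_z, n_x), "Wzu": zeros(n_z, n_u), "Wz": [0.0] * n_z,
        "Wyz": zeros(n_y, n_z), "Wy": [0.0] * n_y,
    }
```

With all weights zero every unit outputs $g(0) = 1/2$, which gives a quick sanity check of the wiring.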


3. The implementation 


The simulation can easily be grasped by first observing that computations in a Turing machine are actually controlled by a Mealy machine:


$$\delta_T(q, \sigma) = (q', \sigma', m)$$


can be seen as

$$\delta_M(q, \sigma) = q'$$

and

$$\lambda_M(q, \sigma) = (\sigma', m)$$


i.e. given a TM $T = (\Sigma_T, Q_T, \delta_T, s_I, D_T)$ we look at the "heart" of $T$, which essentially is a Mealy machine $M = (\Sigma_T, \Gamma, Q_T, \delta_M, \lambda_M, s_I)$ where $\Gamma = \Sigma_T \times D_T$, and $\delta_M$ and $\lambda_M$ are defined as above.

Footnote 1: As is known (Goudreau et al. [9]), networks as above cannot represent the output function of every Mealy machine unless a two-layer scheme is used. Hence we take $Z = [0, 1]^{n_Z}$ to be the hidden output layer, where $n_Z$ is the number of such units.


Remark 2 We can now speak of the transition system of a 
Turing machine meaning the transition system of the corre- 
sponding Mealy machine. 
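The passage from a Turing machine to its Mealy control can be checked mechanically on the machine of Figure 1. The sketch below assumes that an output pair $(\sigma', m)$ is coded as the integer $3\sigma' + m$, with the head movement already coded as 0, 1 or 2 as in the figure; this reading of the caption is our assumption, and the function name `mealy_control` is hypothetical.

```python
# Transition table delta_T of Figure 1(a): (state, symbol) -> (state, symbol, move).
DELTA_T = {
    (0, 0): (0, 0, 1), (0, 1): (1, 0, 1),
    (1, 0): (2, 1, 2), (1, 1): (1, 1, 1),
    (2, 0): (3, 0, 1), (2, 1): (4, 0, 1),
    (3, 0): (0, 1, 0), (3, 1): (3, 1, 1),
    (4, 0): (5, 1, 2), (4, 1): (4, 1, 1),
    (5, 0): (2, 1, 2), (5, 1): (5, 1, 2),
}
N_MOVES = 3  # assumed size of the movement set used in the integer coding

def mealy_control(delta_t, n_moves=N_MOVES):
    """Split delta_T into the next-state map delta_M and the output map lambda_M."""
    delta_m = {key: q for key, (q, _, _) in delta_t.items()}
    lambda_m = {key: s * n_moves + m for key, (_, s, m) in delta_t.items()}
    return delta_m, lambda_m

DELTA_M, LAMBDA_M = mealy_control(DELTA_T)
```

Under this coding the computed `DELTA_M` and `LAMBDA_M` agree entry by entry with the values listed in Figure 1(b).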


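We now look at the stable implementation of a Mealy machine in a DTRNN due to Carrasco et al. [2]. From the original Mealy machine $M$ they construct the split-state split-output Mealy machine $M' = (\Sigma, \Gamma', Q', \delta', \lambda', s'_I)$ where (1) $\Gamma' = \Gamma \times \Sigma$, $Q' = Q \times \Sigma$, and $s'_I$ is any of the $(s_I, \sigma)$, $\sigma \in \Sigma$; (2) $\delta'((q, \sigma), \sigma') = (\delta_M(q, \sigma'), \sigma')$ and (3) $\lambda'((q, \sigma), \sigma') = (\lambda_M(q, \sigma'), \sigma')$.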
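The split-state split-output construction, followed by the removal of states that cannot be reached from the initial state, can be sketched as follows for the Mealy machine $M$ of Figure 1(b). The integer codings $q + |Q|\sigma$ for split states and $\gamma + |\Gamma|\sigma$ for split outputs, and the depth-first renumbering of the surviving states, are assumptions chosen because they reproduce the tables of Figure 2; the function names are hypothetical.

```python
SIGMA = (0, 1)
N_STATES, N_OUTPUTS = 6, 6  # |Q| and |Gamma| for the machine M of Figure 1(b)

DELTA_M = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 1, (2, 0): 3, (2, 1): 4,
           (3, 0): 0, (3, 1): 3, (4, 0): 5, (4, 1): 4, (5, 0): 2, (5, 1): 5}
LAMBDA_M = {(0, 0): 1, (0, 1): 1, (1, 0): 5, (1, 1): 4, (2, 0): 1, (2, 1): 1,
            (3, 0): 3, (3, 1): 4, (4, 0): 5, (4, 1): 4, (5, 0): 5, (5, 1): 5}

def split(delta_m, lambda_m):
    """Build M': state (q, prev) is coded q + |Q|*prev, output (g, a) as g + |Gamma|*a."""
    delta_s, lambda_s = {}, {}
    for q in range(N_STATES):
        for prev in SIGMA:
            s = q + N_STATES * prev
            for a in SIGMA:
                delta_s[s, a] = delta_m[q, a] + N_STATES * a
                lambda_s[s, a] = lambda_m[q, a] + N_OUTPUTS * a
    return delta_s, lambda_s

def prune(delta_s, lambda_s, start=0):
    """Keep only states reachable from start, renumbered in depth-first order."""
    order = []

    def visit(s):
        if s in order:
            return
        order.append(s)
        for a in SIGMA:
            visit(delta_s[s, a])

    visit(start)
    new = {s: i for i, s in enumerate(order)}
    delta_r = {(new[s], a): new[delta_s[s, a]] for s in order for a in SIGMA}
    lambda_r = {(new[s], a): lambda_s[s, a] for s in order for a in SIGMA}
    return delta_r, lambda_r

DELTA_S, LAMBDA_S = split(DELTA_M, LAMBDA_M)
DELTA_R, LAMBDA_R = prune(DELTA_S, LAMBDA_S)
```

Starting from the 12 split states, 8 remain after pruning, which matches the size of the machine $R$ in Figure 2(b).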

The general construction of $M'$ above usually gives rise to inaccessible states and outputs, which can be removed (see Hopcroft and Ullman [11]). Next, following Carrasco et al. [2], we construct a sigmoid DTRNN simulation of $M'$ using the equations below. The numbers of state space units, input lines and output units are, respectively, $n_X = |Q'|$, $n_U = |\Sigma|$ and $n_Y = |\Gamma|$. The weights and biases for the next state function are given by


$$W^{xx}_{ij} = \begin{cases} H & \text{if } \delta'(q_j, \sigma_k) = q_i \text{ for some } \sigma_k \in \Sigma \\ 0 & \text{otherwise} \end{cases}$$

$$W^{xu}_{ik} = \begin{cases} H & \text{if } \delta'(q_j, \sigma_k) = q_i \text{ for some } q_j \in Q' \\ 0 & \text{otherwise} \end{cases}$$

$$W^{x}_{i} = -\frac{3H}{2}, \quad \forall i = 1, \ldots, n_X.$$
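To see the next-state weight scheme at work, the sketch below builds $W^{xx}$, $W^{xu}$ and the bias $-3H/2$ for the eight-state machine $R$ of Figure 2(b) and checks that the network state stays close to the one-hot code of the automaton state along an input word. The exact one-hot inputs, the one-hot initial state and the function names are our assumptions; $H = 9.14211$ is the value quoted for this example.

```python
import math

# Next-state table of the machine R in Figure 2(b).
DELTA_R = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 1, (2, 0): 3, (2, 1): 5,
           (3, 0): 0, (3, 1): 4, (4, 0): 0, (4, 1): 4, (5, 0): 6, (5, 1): 5,
           (6, 0): 2, (6, 1): 7, (7, 0): 2, (7, 1): 7}
N_X, N_U, H = 8, 2, 9.14211

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def next_state_weights(delta, n_x, n_u, h):
    """W^xx_ij = H and W^xu_ik = H whenever delta(q_j, sigma_k) = q_i; bias -3H/2."""
    wxx = [[0.0] * n_x for _ in range(n_x)]
    wxu = [[0.0] * n_u for _ in range(n_x)]
    for (j, k), i in delta.items():
        wxx[i][j] = h
        wxu[i][k] = h
    return wxx, wxu, [-1.5 * h] * n_x

def run(delta, word, h=H):
    """Return the list of (automaton state, network state) after each symbol."""
    wxx, wxu, bias = next_state_weights(delta, N_X, N_U, h)
    x = [1.0 if i == 0 else 0.0 for i in range(N_X)]  # one-hot initial state 0
    q, trace = 0, []
    for k in word:
        u = [1.0 if a == k else 0.0 for a in range(N_U)]
        x = [g(sum(wxx[i][j] * x[j] for j in range(N_X))
               + sum(wxu[i][a] * u[a] for a in range(N_U))
               + bias[i])
             for i in range(N_X)]
        q = delta[q, k]
        trace.append((q, x))
    return trace
```

The check relies on the split-state property: every state of $R$ is entered on a single input symbol, so a unit receives both the state and the input contribution only when it is the correct successor.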


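The output function is defined by the following weight scheme:


$$W^{zx}_{ij} = \begin{cases} H & \text{if } \lambda'(q_j, \sigma_k) = \gamma'_i \text{ for some } \sigma_k \in \Sigma \\ 0 & \text{otherwise} \end{cases}$$

$$W^{zu}_{ik} = \begin{cases} H & \text{if } \lambda'(q_j, \sigma_k) = \gamma'_i \text{ for some } q_j \in Q' \\ 0 & \text{otherwise} \end{cases}$$

$$W^{z}_{i} = -\frac{3H}{2}, \quad \forall i = 1, \ldots, n_Z.$$


In its turn the output layer has the following weights:


$$W^{yz}_{ij} = \begin{cases} H & \text{if } \gamma'_j = (\gamma_i, \sigma_k) \text{ for some } \sigma_k \in \Sigma \\ 0 & \text{otherwise} \end{cases}$$

$$W^{y}_{i} = -\frac{H}{2}, \quad \forall i = 1, \ldots, n_Y.$$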


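In order to find an appropriate value for $H$ above take

$$C_{ik} = \{\, j \mid \delta'(q_j, \sigma_k) = q_i \,\}$$

$$D_{ik} = \{\, j \mid \lambda'(q_j, \sigma_k) = \gamma'_i \,\}$$

and

$$\chi_x = \max_{i,k} |C_{ik}|, \qquad \chi_y = \max_{i,k} |D_{ik}|,$$

and now

$$\chi = \max(\chi_x, \chi_y, n_U).$$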
The iteration scheme 


colt + 1] = (g = ! l 


colt] — By 1- colt] 


A few iterations starting with an intermediate value in $(0, 1/(2\chi))$, such as $\varepsilon_0[0] = 1/(4\chi)$, are enough for any $\chi > 1$.

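Now that we have found $\varepsilon_0$, we can find the minimum value for $H$ such that


$$g\left( H \left( \chi \varepsilon_0 - \frac{1}{2} \right) \right) < \varepsilon_0$$


which is then used to obtain $\varepsilon_1$ such that


$$g\left( H \left( \varepsilon_1 - \frac{1}{2} \right) \right) > \varepsilon_1$$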

In Figure 1(a) we give an example of a Turing Machine $T$ with 6 states and 2 input symbols which multiplies by two a given input in unary notation. The corresponding Mealy machine $M$ implementing the control of $T$ is in Figure 1(b). Figure 2(a) shows the split-state split-output machine $S$ obtained from $M$. $S$ has states which are unreachable from the initial state 0; removing them gives the machine in Figure 2(b). Using the results in the present paper we find $H = 9.14211$, $\varepsilon_0 = 0.0415964$ and $\varepsilon_1 = 0.988652$.
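The quoted constants can be explored numerically. The sketch below assumes that the low-activation bound has the form $g(H(\chi\varepsilon_0 - 1/2)) \le \varepsilon_0$, so that the smallest admissible gain is $H(\varepsilon_0) = g^{-1}(\varepsilon_0)/(\chi\varepsilon_0 - 1/2)$, and that $\varepsilon_1$ is the upper fixed point of $\varepsilon \mapsto g(H(\varepsilon - 1/2))$. The fixed-point rearrangement used in `low_minimizer` is our own choice and need not coincide with the iteration scheme of the text; the values of $\chi$ tried below are likewise assumptions.

```python
import math

def g(x):
    return 1.0 / (1.0 + math.exp(-x))

def g_inv(p):
    """Inverse of the logistic map on (0, 1)."""
    return math.log(p / (1.0 - p))

def low_minimizer(chi, iterations=100):
    """Value of eps0 minimizing H(eps0) = g_inv(eps0) / (chi*eps0 - 1/2).

    Uses the rearranged stationarity condition
    eps0 = 1 / (2*chi*(1 - (1 - eps0)*g_inv(eps0))), started at 1/(4*chi).
    """
    eps = 1.0 / (4.0 * chi)
    for _ in range(iterations):
        eps = 1.0 / (2.0 * chi * (1.0 - (1.0 - eps) * g_inv(eps)))
    return eps

def min_gain(chi):
    """Smallest H admitted by the assumed low-activation bound."""
    eps = low_minimizer(chi)
    return g_inv(eps) / (chi * eps - 0.5)

def high_fixed_point(h, iterations=100):
    """Upper fixed point eps1 of eps -> g(h*(eps - 1/2)), iterated from 1."""
    eps = 1.0
    for _ in range(iterations):
        eps = g(h * (eps - 0.5))
    return eps
```

Under these assumptions, `high_fixed_point(9.14211)` agrees with the quoted $\varepsilon_1$, `low_minimizer(3)` agrees with the quoted $\varepsilon_0$, and `min_gain(4)` agrees with the quoted $H$; the text does not state which value of $\chi$ was used for each quantity.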

By repeating these steps starting now with a (minimal [18]) Universal Turing Machine with 24 states and 2 input symbols (UTM(24,2)), we obtain $H = 9.862505$, $\varepsilon_0 = 0.0200119$ and $\varepsilon_1 = 0.991678$. The resulting Mealy machine is given in [3].

The number of units depends on the number of unreachable states being removed and obviously cannot be predicted in general. But we can easily find an upper bound as follows. The number of units for a Turing machine with $m$ states and $n$ symbols is $3n^2 + mn + 4n$, which in our case (two-symbol TM) becomes $2(m + 10)$.
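One way to read this count, assuming three possible head movements and counting input lines as units, is as the sum of the split states, the input lines, the split outputs of the hidden layer and the output units. This breakdown is our interpretation of the construction, and the function name `unit_count` is hypothetical.

```python
def unit_count(m, n, moves=3):
    """Upper bound on the number of units for a TM with m states and n symbols."""
    n_x = m * n            # split states Q x Sigma
    n_u = n                # input lines, one per symbol
    n_z = (n * moves) * n  # split outputs (Sigma x D) x Sigma in the hidden layer
    n_y = n * moves        # output units, one per pair in Sigma x D
    return n_x + n_u + n_z + n_y
```

For instance, `unit_count(24, 2)` gives 68 units for UTM(24,2), in agreement with $2(m + 10)$.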

We have thus sketched a proof of the following:


Theorem 1 Any Turing machine over $\Sigma = \{0, 1\}$ with $m$ states can be simulated by a sigmoid Turing neural network with $2(m + 10)$ units.
$(q, \sigma)$    $\delta_T(q, \sigma)$    $\delta_M(q, \sigma)$    $\lambda_M(q, \sigma)$
(0, 0)    (0, 0, 1)    0    1
(0, 1)    (1, 0, 1)    1    1
(1, 0)    (2, 1, 2)    2    5
(1, 1)    (1, 1, 1)    1    4
(2, 0)    (3, 0, 1)    3    1
(2, 1)    (4, 0, 1)    4    1
(3, 0)    (0, 1, 0)    0    3
(3, 1)    (3, 1, 1)    3    4
(4, 0)    (5, 1, 2)    5    5
(4, 1)    (4, 1, 1)    4    4
(5, 0)    (2, 1, 2)    2    5
(5, 1)    (5, 1, 2)    5    5


Figure 1. (a) A Turing machine $T$ which multiplies by two a given number in unary notation (column $\delta_T$). (b) The control of the TM $T$ as a Mealy machine $M$ (columns $\delta_M$ and $\lambda_M$). An output pair $(\sigma, m)$, $\sigma \in \Sigma$, $m \in D$, is coded as the integer $\sigma |D| + m$.


4. Definite Turing Machines 


Definition 4 ([16, 20]) A transition system $(Q, \delta)$ is $k$-definite if and only if, for every word $w$ of length greater than $k$, $\delta(s, w) = \delta(s', w)$ for any pair of states $s, s'$.


Definition 5 A Mealy machine is $k$-definite if, and only if, its transition system is.


Definition 6 A Turing machine is $k$-definite if, and only if, its transition system is.


There are various algorithms for deciding whether a given transition system is definite or not. We use one based on the testing table and testing graph presented in [13]. The testing table has $p = |\Sigma|$ columns, one for each symbol in the input alphabet. Its rows are divided into two parts: the upper part corresponds to the states of the machine, and the table entries are the state transitions. The row headings in the lower part of the table are all unordered pairs of distinct states, while the table entries are the corresponding pairs of state transitions. The testing graph is a directed graph which has as vertices the row headings in the lower part of the testing table, and there is an edge from $(p, s)$, $p, s \in Q$, to $(p', s')$, $p' \neq s'$, if and only if there is an entry $(p', s')$ in row $(p, s)$, column $\sigma \in \Sigma$. The edge is labelled $\sigma$. There is no edge if $(p, s)$ leads to $(p', p')$. The transition system is $k$-definite if and only if its testing graph $G$ is loop-free and the length of the longest path in $G$ is $k - 1$.

A non-definite TM is shown in Figure 3 while in Figure 4 
there is a definite one. 


5. Simple Functions 


(a) Machine $S$

$q$    $\delta_S(q, 0)$    $\delta_S(q, 1)$    $\lambda_S(q, 0)$    $\lambda_S(q, 1)$
0     0    7     1    7
1     2    7     5    10
2     3    10    1    7
3     0    9     3    10
4     5    10    5    10
5     2    11    5    11
6     0    7     1    7
7     2    7     5    10
8     3    10    1    7
9     0    9     3    10
10    5    10    5    10
11    2    11    5    11

(b) Machine $R$

$q$    $\delta_R(q, 0)$    $\delta_R(q, 1)$    $\lambda_R(q, 0)$    $\lambda_R(q, 1)$
0    0    1    1    7
1    2    1    5    10
2    3    5    1    7
3    0    4    3    10
4    0    4    3    10
5    6    5    5    10
6    2    7    5    11
7    2    7    5    11


Figure 2. (a) The split-state split-output Mealy machine $S$ obtained from $M$ (Figure 1(b)). We have encoded pairs $(q, \sigma)$ into integers $q + |Q| \sigma$. (b) The Mealy machine $R$ obtained from $S$ in (a).


In Computing Theory, on the class of primitive recursive functions we can define a hierarchy of classes of functions $\mathcal{L}_0 \subset \mathcal{L}_1 \subset \mathcal{L}_2 \subset \cdots$ of increasing complexity (e.g. see the textbook [1]). They are related to programs having a maximum depth of nesting of FOR statements in a toy programming language: $\mathcal{L}_i$ is the class of functions computed by programs having a maximum depth of nesting of $i$ FOR statements. One should note that the whole of the primitive recursive functions can be computed by programs with no GOTO statements (see Theorem 3.3, page 53 in [1]).
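The notion of nesting depth can be illustrated with two small programs, written here in Python rather than in the toy language of [1]. The sketch assumes that a FOR statement repeats its body a number of times fixed on entry; the function names `add` and `mul` are our own.

```python
def add(x, y):
    """Addition with a single FOR loop (nesting depth 1)."""
    for _ in range(y):
        x += 1
    return x

def mul(x, y):
    """Multiplication with two nested FOR loops (nesting depth 2)."""
    z = 0
    for _ in range(y):
        for _ in range(x):
            z += 1
    return z
```

Both programs use only increments and bounded loops, so the depth of nesting is the only resource that separates them.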

$q$     $\delta(q, 0)$    $\delta(q, 1)$
0     (0, 0, 1)     (1, 1, 2)
1     (2, 1, 1)     (1, 1, 2)
2     (10, 0, 1)    (3, 0, 1)
3     (4, 0, 1)     (3, 1, 1)
4     (4, 0, 1)     (5, 0, 1)
5     (7, 0, 2)     (6, 1, 2)
6     (6, 0, 2)     (1, 1, 2)
7     (7, 0, 2)     (8, 1, 2)
8     (9, 0, 2)     (8, 1, 2)
9     (2, 0, 1)     (1, 1, 2)
10    (11, 0, 1)    (10, 1, 1)
11    (11, 0, 2)    (11, 1, 2)


Figure 3. A non-definite Turing machine $T$ which finds the gcd of two numbers in unary notation using Euclid's algorithm.


$q$    $\delta(q, 0)$    $\delta(q, 1)$
0    (9, 0, 0)    (1, 1, 1)
1    (7, 0, 2)    (2, 0, 1)
2    (7, 0, 2)    (3, 0, 1)
3    (9, 0, 0)    (4, 1, 2)
4    (4, 0, 2)    (5, 1, 1)
5    (8, 1, 1)    (6, 1, 0)
6    (6, 0, 1)    (6, 1, 1)
7    (7, 0, 2)    (9, 0, 0)
8    (8, 0, 1)    (1, 0, 1)
9    (9, 0, 0)    (9, 1, 0)


Figure 4. A definite Turing machine $T$ which divides a number in unary notation by 3.


Two classes in this hierarchy are particularly interesting: $\mathcal{L}_1$, the class of simple functions, and $\mathcal{L}_2$, the elementary functions. For $\mathcal{L}_1$, the equivalence of programs is decidable, while for all $i \geq 2$ the equivalence is undecidable. $\mathcal{L}_2$ is claimed to contain all real problems, since a program for a non-elementary function, say $g(x)$, must, for any $k$, use more than


$$\left. 2^{2^{\cdot^{\cdot^{\cdot^{2^{x}}}}}} \right\} k$$


steps on infinitely many inputs. For this reason the elementary functions are also called practical computable functions.

In order to show that definite TMs compute the class of simple functions, we need the following characterization of the class $\mathcal{L}_1$ (see [1]).


Theorem 2 The class of simple functions is the smallest class which contains the basic simple functions


$d(x) = (x, x)$
$e(x, y) = (y, x)$
(x) = 2 
p(x) = 0 
z() = (x) 


$s(x) = x + 1$
$x + y$
$x - 1$
$x/k$ for each constant $k > 1$
$\mathrm{mod}(x, k)$ for each constant $k > 1$


ifx —0 
Kenen fh ETO 


and which is closed under composition and combination. 


The Euclidean algorithm shows that $\gcd(x, y)$ is an elementary function, and our Figure 3 shows a non-definite TM for it. It remains to be shown that it is impossible to find a definite TM for the Euclidean algorithm in particular and for the elementary functions in general.

In contrast, for the simple functions the situation is 
straightforward. 


Theorem 3 The class $\mathcal{L}_1$ of simple functions is computable by definite Turing machines.


Proof: We use the characterization in Theorem 2. The less trivial cases in the list of basic simple functions are those which involve division: $x/k$ and $\mathrm{mod}(x, k)$. But already in Figure 4 we show a definite TM for division by 3, which can obviously be extended to any fixed $k > 1$. The TM for $\mathrm{mod}(x, k)$ is a submachine (in the graph-theoretical sense) of the TM for $x/k$, and a definite transition system does not have a non-definite subsystem.

The operations of composition and combination are nothing but the serial and parallel operations, respectively, of [20], where it is proved that they preserve definiteness. □


6. Finite Memory Machines 


We have shown how to get a sequential machine of the Mealy type from a Turing machine. For Mealy machines there is another notion of finite memory, which is related to the output as well and not only to the input as in the definite case. In a $k$-definite automaton the present state is completely determined by the last $k$ inputs.


Definition 7 A Mealy machine $M$ is a finite-memory machine of order $k$ if $k$ is the least integer such that the present state of $M$ can be determined uniquely from the knowledge of the last $k$ inputs and the corresponding $k$ outputs.


Definition 8 A Turing machine is finite-memory if, and only if, its associated Mealy machine is finite-memory.


In contrast to definite machines, which do not recognize all regular languages, finite-memory machines are capable of recognizing all regular languages.


Theorem 4 For any regular language L there exists a 
finite-memory machine M which recognizes L. 


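Proof: Let $N = (\Sigma, Q, \delta, q_I, F)$ be the finite state automaton recognizing $L$. Define the machine $M = (\Sigma, \Gamma, Q, \delta, \lambda, q_I)$, where $\Gamma = Q$ and the output is the state just entered, $\lambda = \delta$. $M$ is obviously finite-memory of order 1, and the language represented by the subset $F \subseteq \Gamma$ of output letters is equal to $L$. □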
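The construction in the proof can be sketched on a small example. The code below assumes that the output function simply repeats the next-state function, so that the last output always names the present state, and it treats the empty word as accepted exactly when the initial state is final. The parity automaton and the function names are hypothetical illustrations.

```python
def mealy_from_dfa(delta):
    """Mealy machine on the same states whose output is the state just entered."""
    lam = dict(delta)  # lambda(q, a) = delta(q, a), so Gamma = Q
    return delta, lam

def run(delta, lam, start, word):
    """Return the list of (state, output) pairs after each input symbol."""
    q, trace = start, []
    for a in word:
        out = lam[q, a]
        q = delta[q, a]
        trace.append((q, out))
    return trace

def accepts(delta, lam, start, final, word):
    """Accept when the last output letter belongs to the set of final states."""
    trace = run(delta, lam, start, word)
    last = trace[-1][1] if trace else start
    return last in final

# Hypothetical example: words over {0, 1} with an even number of 1s.
PARITY = {(q, a): q ^ a for q in (0, 1) for a in (0, 1)}
DELTA, LAM = mealy_from_dfa(PARITY)
```

Since each output equals the state just entered, one input-output pair suffices to determine the present state, which is the order 1 finite-memory property used in the proof.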


Corollary 1 Turing machines and finite-memory Turing machines are computationally equivalent.
7. Final Remarks


Motivated by our research on the relationships between classical and neural computation [4, 3] and by the results of Maass et al. [14, 15] limiting the capabilities of analog neural computation, we proposed a novel class of Turing machines by restricting the type of finite control of the machine.

In this work we take the first steps towards the characterization of the class of functions computable by definite Turing machines. We have also proved the equivalence between finite-memory Turing machines and classical Turing machines.

We are extending the results to Eilenberg's X-machines [6], showing that definite X-machines compute the elementary functions [5].